This repository was archived by the owner on Oct 12, 2022. It is now read-only.

Qualify C-style memory allocation functions as pure#1683

Merged
andralex merged 1 commit into dlang:master from nordlow:pure-stdlib-alloc
Nov 1, 2016

@nordlow
Contributor

@nordlow nordlow commented Oct 24, 2016

In response to dlang/dmd#6197

Why does the FreeBSD_32 target fail?

@nordlow nordlow changed the title Qualify C-style memory allocations functions as pure Qualify C-style memory allocation functions as pure Oct 24, 2016
@andralex
Member

cc @MartinNowak @WalterBright to make sure we're on the same page.

The failure is caused by "timelimit: sending warning signal 15", i.e. the test took too long. Please push again and it should work.

@nordlow
Contributor Author

nordlow commented Oct 24, 2016

Pushed again.

@JackStouffer
Contributor

If you log in to the auto tester, you can just deprecate the bad test

@nordlow
Contributor Author

nordlow commented Oct 25, 2016

BTW: Shouldn't we qualify free as pure as well?

@schveiguy
Member

BTW: Shouldn't we qualify free as pure as well?

I don't think so. free doesn't return anything, and accepts a mutable argument. Making this pure means that a function like the one below is likely to happen:

void foo(immutable T* ptr) pure
{
   ...
   free(cast()ptr);
   ...
}

The cast is likely to happen because "I know what I'm doing, and this needs to be freed at this point". It may even be in a destructor, done without the author's knowledge (think reference counting).

But look at the signature of foo! The compiler can look at that signature, and determine that there is no reason to call the function (it's a pure function that returns nothing, and has no side effects). Which might result in a memory leak.
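
To make the concern concrete, here is a minimal sketch. `assumedPureFree` is a hypothetical binding introduced only for illustration (this PR does not mark free as pure), and `release` is an invented name. Nothing here claims that a particular compiler actually drops the call; the sketch only shows a signature from which a call looks effect-free.

```d
import core.stdc.stdlib : malloc;

// Hypothetical binding (not part of druntime's public API): C's free,
// declared pure under a local name purely for illustration.
extern (C) pure nothrow @nogc @system
{
    pragma(mangle, "free") void assumedPureFree(void* ptr);
}

// Immutable parameter, void result, pure: judging by the signature
// alone, a call to this function has no observable effect.
void release(immutable(int)* ptr) pure nothrow @nogc @system
{
    // The "I know what I'm doing" cast from the comment above.
    assumedPureFree(cast(void*) ptr);
}

void main()
{
    auto p = cast(int*) malloc(int.sizeof);
    assert(p !is null);
    *p = 42;
    // If an optimizer removed this call, the block would simply leak.
    release(cast(immutable(int)*) p);
}
```

The program behaves the same to an outside observer whether or not the call runs, which is exactly why an elided call would surface only as a hard-to-trace leak.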

@andralex
Member

It's an interesting question. We could classify pure functions that return void as "nominally pure but don't optimize away". I'd say let's keep this in mind and explore the possibility if a practical need arises.

@schveiguy
Member

We could classify pure functions that return void as "nominally pure but don't optimize away".

This could be a possibility. But then aren't we reducing the usefulness of pure functions? It would be quite easy to make a template that gets inferred as pure in some cases, and then I would want the compiler to optimize away needless calls.

@andralex
Member

@schveiguy if it returns void and is pure... that's either a bug or a feature, but not a call you want optimized away :o). I know, it's zen.

@schveiguy
Member

schveiguy commented Oct 25, 2016

For instance:

void popFrontN(R)(ref R range, size_t n) // inferred pure when range.popFront is pure
{
   foreach(i; 0 .. n) range.popFront;
}

struct Repeat(T)
{
   T val;
   T front() const { return val; }
   void popFront() const {} // inferred pure
   enum empty = false;
}

immutable foo = Repeat!int(5);
foo.popFrontN(1_000_000); // please elide this

Sure, a contrived example. But these kinds of things happen all the time with generic code. I'm expecting the compiler to elide the calls that I would have if I wrote it by hand instead of using a generic template.

@MetaLang
Member

@schveiguy I did agree with this but the more I think about it, the more I think that the compiler eliding function calls based on criteria that are inferred is a terrible, terrible idea. It's just ripe for wrongly-elided calls and I'm surprised we haven't had more reports of it happening already (probably because much of Phobos is impure).

@nordlow
Contributor Author

nordlow commented Oct 26, 2016

@schveiguy your code example with cast is not allowed in @safe mode. What about fixing DMD so that it only performs these optimizations on calls to @safe strongly pure functions?

@WalterBright perhaps this already works in this way? If so I guess a unittest is in order to verify that DMD does not optimize away the second call to a pure-qualified free for the same argument.

Update: What about the purity of Mallocator?

@nordlow
Contributor Author

nordlow commented Oct 29, 2016

Ping, @andralex! Should I add a unittest?

@ibuclaw
Member

ibuclaw commented Nov 1, 2016

Not this again.

@nordlow
Contributor Author

nordlow commented Nov 1, 2016

@ibuclaw what do you mean?

@andralex
Member

andralex commented Nov 1, 2016

@ibuclaw we've gotten to a better understanding of the matters involved and fixed a bug in the compiler such that pure memory allocation functions are always considered weakly (not strongly) pure.

@andralex
Member

andralex commented Nov 1, 2016

Auto-merge toggled on

@JackStouffer
Contributor

I'm really starting to lose track of what pure means anymore. We're diving head first into C++ style complexity here.

@schveiguy
Member

the compiler eliding function calls based on criteria that are inferred is a terrible, terrible idea.

This argumentation works until you start peeling layers. The elided call doesn't have to be an inferred pure one.

It's also not practical -- I may use a template because I want to write my code once for many types. I don't want to have to repeat this identical code in a pure and non-pure template, just because I want to have efficient generated code.

In any case, my dire prediction is that marking free as pure (which, BTW, this PR does not do) would lead to memory heisenbugs, but I could also be wrong, and everything will be fine 😉

@JackStouffer The explanation of D pure is simple. The benefits the optimizer enjoys are what is complex. And the user doesn't generally have to be aware of this, they just know adding pure makes your code run faster. That's OK as long as all the stuff that is "assumed" pure is effectively pure. Resource allocation gives us kind of a way to cheat, because before the resource is allocated, it may be part of some global state, but we can consider that state to be outside the scope of the main executable. As long as we are consistent and rigorous about that assumption, the ruse works.

@JackStouffer
Contributor

This really feels like cheating that will come back to bite us one day. I can see a DIP being submitted within a year to create a strongpure attribute.

That's OK as long as all the stuff that is "assumed" pure is effectively pure.

I'm genuinely asking, what on Earth does "effectively pure" even mean?

Conceptually what is the difference between making calloc pure and making the creation of a new file pure? They both do IO, they can both overwrite existing data, and their impure failures are relegated to rare edge cases.

The edge cases are always the most important focus when creating a language feature because the simple stuff almost always will work. It's the edge cases which can shoot you in the foot. So when your function call is only pure normally and is impure in the edge cases, that's an impure function.

The explanation of D pure is simple.

Let's try to write one with this change in mind:

A pure function is a function which does not read or modify anything outside of its scope such as reading a global variable or doing IO. But modifying function parameters which are taken by ref is allowed, which allows one to modify a scope outside of its scope. Pure disallows IO, unless you're in a debug statement and unless you're doing memory allocation, which also does IO and if you do too much of it your program isn't pure because malloc will eventually return null, but this is still allowed.

Short version: pure gives you guarantees 99% of the time. All other times, pure functions are no more pure than impure functions.

This is a C++ style explanation if I ever saw one.

I don't think this change is particularly bad, it's just this is another step in a trend in D of features that no one really knows how they work. See inout, return, return ref, scope, and now pure.

@jpf91
Contributor

jpf91 commented Nov 1, 2016

Maybe a stupid question, but could somebody please explain why

void*   malloc(size_t size) pure;

is weakly (and not strongly) pure? AFAIR a method is only weakly pure if any parameter can cause side-effects (i.e. one parameter needs to have indirections and be mutable). Or does

fixed a bug in the compiler such that pure memory allocation functions are always considered weakly (not strongly) pure.

mean you've hardcoded a list of such functions in the compiler?
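
For readers unfamiliar with the terms, the parameter-based classification recalled above can be sketched as follows. The function names are invented for illustration, and "strongly"/"weakly" pure are the informal labels used in this thread, not keywords.

```d
// Strongly pure in the thread's terminology: the parameter has no
// mutable indirections, so the result depends only on the argument.
int twice(int x) pure nothrow @nogc @safe
{
    return 2 * x;
}

// Weakly pure: it may mutate what its argument points to, but it
// still cannot touch mutable global state.
void bump(int* p) pure nothrow @nogc
{
    ++*p;
}

// Strongly pure from the caller's point of view, even though it
// uses a weakly pure helper on a local copy of its argument.
int bumpedTwice(int x) pure nothrow @nogc
{
    bump(&x);
    return twice(x);
}

void main()
{
    int n = 1;
    bump(&n);
    assert(n == 2);
    assert(twice(21) == 42);
    assert(bumpedTwice(20) == 42);
}
```

Under this rule alone a declaration such as malloc(size_t) looks strongly pure, which is what makes the question above a fair one.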

@nordlow
Contributor Author

nordlow commented Nov 1, 2016

@JackStouffer, calloc does not overwrite existing data visible to other parts of the program. Your so-called calloc-I/O is not observable by anyone other than the caller.

But I agree on one point; the definition of weakly pure has to be extended to nullary functions that return non-immutable references.

@schveiguy
Member

@jpf91 See discussion and explanations here: https://issues.dlang.org/show_bug.cgi?id=15862

@schveiguy
Member

I'm genuinely asking, what on Earth does "effectively pure" even mean?

If you imagine, the pure world-view is that before something is created, it never existed. However, practically, it must already exist (a computer is a finite system, and everything requested by a program exists before it's requested). So we need a place to cheat, but still present an effectively pure view of the world.

It means that you cannot see the difference between receiving a piece of data that was once part of a global pool (e.g. visible only by the GC), and one that never existed before.

Think of pure-driven memoization. Where is the memoized data stored? Has to be in a global pool, right? But yet, the function that uses it is pure, because the caller cannot tell the difference.

In your file example, I would say creating a NEW file that is locked and/or cannot be seen by any others is "effectively" pure (and so will all i/o for it be pure), but opening a file that exists or opening a file that others can open, etc. cannot be pure.

@jpf91
Contributor

jpf91 commented Nov 1, 2016

@schveiguy thanks. I think we really need some updated strong/weak pure documentation though, AFAIK the official reference on dlang.org does not even mention weak vs strong pure.

@schveiguy
Member

AFAIK the official reference on dlang.org does not even mention weak vs strong pure

And it doesn't have to. The implications of weak vs. strong pure are compiler optimization details. Only the rules need to be described.

That being said, I think we should describe and define the terms somewhere on the site.

@andralex
Member

andralex commented Nov 1, 2016

@nordlow would you consider updating the docs please?

@nordlow
Contributor Author

nordlow commented Nov 1, 2016

I'll give it a try, @andralex

@andralex andralex merged commit 9997ce5 into dlang:master Nov 1, 2016
@aG0aep6G
Contributor

aG0aep6G commented Nov 1, 2016

On 11/01/2016 05:54 PM, Jack Stouffer wrote:

Conceptually what is the difference between making |calloc| |pure| and
making the creation of a new file |pure|? They both do IO, they can
both overwrite existing data, and their impure failures are relegated
to rare edge cases.

Can you elaborate on how you consider calloc doing IO? And calloc can
only overwrite existing data when you mistakenly consider freed memory
to be alive, right? That's invalid and means relying on undefined
behavior, no?

The explanation of D pure is simple.

Let's try to write one with this change in mind:

A pure function is a function which does not read or modify anything
outside of its scope such as reading a global variable or doing IO.
But modifying function parameters which are taken by |ref| is allowed,
which allows one to modify a scope outside of its scope. Pure
disallows IO, unless you're in a |debug| statement and unless you're
doing memory allocation, which also does IO and if you do too much of
it your program isn't pure because |malloc| will eventually return
|null|, but this is still allowed.

(Note: In the following I don't consider memory allocations to do IO.
You seem to have a different view on that.)

Naively: A pure function does not read or write mutable global
variables directly (without them being passed in parameters), and it
doesn't do IO.

Being pedantic, that excludes any kind of memory allocation, including
via the GC, because they necessarily use (more or less private) globals
for their book-keeping.

So maybe: A pure function does not do IO, and it does not
significantly mutate global variables. Any changes to global variables
done by pure functions are deemed insignificant, meaning the compiler
may ignore them when checking for elidable function calls.

Then there would be a paragraph about how the compiler must reject any
and all accesses of mutable globals when analyzing the body of a pure
function, because it cannot know if the access is "significant" or not.

This is getting rather complicated, for sure. And I probably missed
something.

The thing is that GC.malloc has been pure forever, while not being
purer than C's malloc. So we can either tweak the rules for pure so
that that's fine, and then it should be possible to make C's malloc
pure, too. Or we lose the pure GC.malloc. Or we special case
GC.malloc (along with new), but then we get the worst of both worlds:
complicated rules while still being very restricted with regards to
allocations.

I don't think this change is particularly bad, it's just this is
another step in a trend in D of features that no one really knows how
they work. See |inout|, |return|, |return ref|, |scope|, and now |pure|.

I'm afraid pure has been on that list for a while now.

@jmdavis
Member

jmdavis commented Nov 1, 2016

@JackStouffer @aG0aep6G You're both complicating pure considerably. All that pure means is that the function can access global/static non-immutable state except via its arguments. You don't even need to consider stuff like I/O, because that's automatic as soon as you can't access mutable global state. As soon as you see pure on a function, you know that it's not accessing stuff that wasn't passed to it. That's what pure gives you. And the complications are stuff like whether the compiler is able to implicitly convert the return type to const or immutable, because it can guarantee that the return value was not passed into the function. pure itself is very simple.

The only two aspects of it that are arguably odd at all are that memory allocation is permitted and that debug is usable as a backdoor. debug is usable as a backdoor purely for debugging purposes, and there's nothing complicated about that. And memory allocation is permitted, because the heap itself is not part of the global state of the program. As far as the language is concerned, return 42; and return new int(42); both create and return new values, and there's no global state involved. Each call to new could even allocate from a different heap, and the language wouldn't know or care. But worst case, all you have to know is that allocation is permitted in pure functions, and if that seems like an exception to you, it's still the only one besides the explicit backdoor for debugging. So, it's still quite simple. And arguably, memory allocation really isn't even an exception.

All of the difficulty in marking stuff like C functions as pure comes from figuring out whether they could affect the global state of the program, and since they're outside of D, the programmer has to do that rather than the compiler telling them as is normally the case. And malloc can be pure for the same reasons that new can be.
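
The parallel between new and malloc can be sketched as below. `assumedPureMalloc` is a hypothetical local binding, used so that the sketch does not depend on how a particular druntime version declares malloc; `boxGC` and `boxC` are invented names.

```d
import core.stdc.stdlib : free;

// Hypothetical binding, for illustration only: C's malloc declared
// pure under a local name. The programmer, not the compiler, vouches
// for the attribute on an extern(C) declaration.
extern (C) pure nothrow @nogc @system
{
    pragma(mangle, "malloc") void* assumedPureMalloc(size_t size);
}

// Allocation through the GC has long been allowed in pure code.
int* boxGC(int value) pure nothrow
{
    return new int(value);
}

// The same shape using the C heap; the caller owns the result and
// must free it. A null result signals allocation failure.
int* boxC(int value) pure nothrow @nogc
{
    auto p = cast(int*) assumedPureMalloc(int.sizeof);
    if (p !is null)
        *p = value;
    return p;
}

void main()
{
    int* a = boxGC(42);
    int* b = boxC(42);
    assert(*a == 42);
    assert(b !is null && *b == 42);
    free(b);
}
```

From the caller's side both functions hand back a value that did not exist before the call; the only visible difference is who is responsible for releasing it.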

Now, if you start worrying about exactly when optimizations can be made based on pure or whether implicit conversions can be made based on it, it does get quite complicated. But knowing when you can use pure and what it guarantees is quite simple. Even discussions of strongly pure vs weakly pure are overcomplicating the issue. They're really just a tool for figuring out when actual, mathematical, functional purity applies and what optimizations the compiler is able to do and aren't necessary for actually using and understanding pure.

@aG0aep6G
Contributor

aG0aep6G commented Nov 1, 2016

@JackStouffer @aG0aep6G You're both complicating pure considerably. All that pure means is that the function can access global/static non-immutable state except via its arguments.

(Assuming you mean "that the function cannot access [...]".)

I'm afraid that's not true when GC.malloc is pure. It must make use of mutable globals for book-keeping, doesn't it?

You don't even need to consider stuff like I/O, because that's automatic as soon as you can't access mutable global state.

Depends on how "state" is defined, I guess. You can make output without accessing global variables, as you don't need them to make a syscall. Maybe that still involves touching global "state", but then we have to define "state" to include IO but not memory allocation. So we have to consider IO at some point, no?

debug is usable as a backdoor purely for debugging purposes, and there's nothing complicated about that.

I agree that debug is not a problem, and that it's best handled as an exception. It just ignores pure completely, and that's ok.

And memory allocation is permitted, because the heap itself is not part of the global state of the program. As far as the language is concerned, return 42; and return new int(42); both create and return new values, and there's no global state involved.

Ok, now you're adding an exception to your "very simple" rule, making it not so simple anymore. I'm not opposed to doing it that way. State a simple base rule and then the exceptions to it; fine. But GC.malloc doesn't just allocate memory from the OS, it also has to do book-keeping via mutable globals.

In the end, "state" needs to be carefully defined if we want a pure GC.malloc that doesn't break the rules.

nordlow added a commit to nordlow/dlang.org that referenced this pull request Nov 1, 2016
Update according to dlang/druntime#1683

I'm not sure if this is enough or if I should update the general formulation of purity as well.

Made an existing statement more clear via a neither-nor formulation.
@nordlow
Contributor Author

nordlow commented Nov 1, 2016

@andralex Added dlang/dlang.org#1510

Do we need to extend the general explanation of purity as well?

If so, I'd be happy to receive proposals or just a brief summary of what to include.

@jmdavis
Member

jmdavis commented Nov 2, 2016

Depends on how "state" is defined, I guess. You can make output without accessing global variables, as you don't need them to make a syscall. Maybe that still involves touching global "state", but then we have to define "state" to include IO but not memory allocation. So we have to consider IO at some point, no?

I/O either involves calling non-pure C functions or accessing module-level variables such as std.stdio.stdout. So, it's automatically not pure. If you want a more precise definition of mutable, global state, then pure cannot access variables which are either static or at module-level (and thus implicitly static) unless they're either immutable or a const value type (and thus cannot possibly be mutated after they're initialized). And I/O clearly violates that.

Memory allocation is more of a grey area, but the functions in question are marked as pure, so whatever mutable, global state they may mess with underneath the hood is invisible, and because they do not violate the compiler guarantees for pure, it doesn't matter. And yes, understanding those compiler guarantees gets complicated, but for the most part, that really doesn't matter. A pure function can only call other pure functions, and it can't access any module-level or static variables unless they're immutable or are const value types. Your code can just call GC.malloc and not care, because it's marked as pure. Just like with an @trusted function, what it does internally is an implementation detail, and it really doesn't matter unless the person who marked it as pure screwed up and violated the compiler guarantees. And the only time that you have to worry about doing the equivalent of @trusted with pure is when you're declaring an extern(C) function to be pure.

It's even normally simple to figure out whether a C function can be marked as pure. Are you sure that it doesn't access any global or static variables - either directly or via any of its function calls? Yes? Then it can be pure. If not, then don't mark it as pure. It's just when you try and insist that it be pure anyway that things get complicated, because then you have to be sure that you understand the guarantees that the compiler makes in order to be sure that marking the function as pure won't violate those guarantees. And that sort of thing should only be done by experts that understand the compiler guarantees, and even then, it's very rare that it's appropriate.

There's no question that understanding what the compiler does with pure is complicated, but actually using pure and knowing what it means is simple. A pure function is a function that can only call other pure functions and cannot access module-level or static variables unless they are immutable or are const value types (and thus cannot be mutated after being initialized). That's simple. The confusion usually comes from the fact that folks expect functional purity rather than something that really should be called something like @noglobal or @nostatic. And often that devolves into discussions about "strong" and "weak" purity, because that matters for the compiler optimizations and when you get functional purity. If we'd have just called it @noglobal, then most of the discussions of "strong" and "weak" purity would go away, and we'd probably end up with something like 5% of the confusion we tend to get now.
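
The rule as stated here can be checked mechanically. The following is a small sketch with invented names (`counter`, `limit`, `next`, `capped`, `traced`); the static asserts rely on `__traits(compiles, ...)` to show which accesses the compiler rejects inside pure code.

```d
int counter;               // mutable module-level variable
immutable int limit = 10;  // immutable module-level variable

// Not pure: it mutates a module-level variable.
int next()
{
    return ++counter;
}

// Allowed: reads only immutable module-level data.
int capped(int x) pure nothrow @nogc @safe
{
    return x > limit ? limit : x;
}

// Allowed: a debug statement may bypass the purity check. It is only
// compiled in when the program is built with -debug.
int traced(int x) pure nothrow @nogc @safe
{
    debug counter++;
    return x;
}

void main()
{
    // Reading a mutable module-level variable in pure code is rejected.
    static assert(!__traits(compiles, () pure { return counter; }));
    // Reading an immutable one is accepted.
    static assert( __traits(compiles, () pure { return limit; }));
    // Calling an impure function from pure code is rejected.
    static assert(!__traits(compiles, () pure { return next(); }));

    assert(capped(25) == 10);
    assert(capped(3) == 3);
    assert(traced(7) == 7);
}
```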

@aG0aep6G
Contributor

aG0aep6G commented Nov 2, 2016

On 11/02/2016 04:20 AM, Jonathan M Davis wrote:

Depends on how "state" is defined, I guess. You can make output
without accessing global variables, as you don't need them to make
a syscall. Maybe that still involves touching global "state", but
then we have to define "state" to include IO but not memory
allocation. So we have to consider IO at some point, no?

I/O either involves calling non-|pure| C functions or accessing
module-level variables such as |std.stdio.stdout|.

You can also make output by making the syscall in assembler:

void main()
{
     auto message = "Hello, world!\n";
     auto length = message.length;
     auto pointer = message.ptr;
     asm
     {
         // linux x86_64
         mov RAX, 1; // write
         mov RDI, 1; // stdout
         mov RSI, pointer;
         mov RDX, length;
         syscall;
     }
}

The code doesn't touch any globals. So if that was the only requirement,
the asm block and main could be marked pure.

Memory allocation is more of a grey area, but the functions in
question are marked as |pure|, so whatever mutable, global state they
may mess with underneath the hood is invisible, and because they do
not violate the compiler guarantees for |pure|, it doesn't matter. And
yes, understanding those compiler guarantees gets complicated, but for
the most part, that really doesn't matter. A |pure| function can only
call other |pure| functions, and it can't access any module-level or
|static| variables unless they're |immutable| or are |const| value
types. Your code can just call |GC.malloc| and not care, because it's
marked as |pure|. Just like with an |@trusted| function, what it does
internally is an implementation detail, and it really doesn't matter
unless the person who marked it as |pure| screwed up and violated the
compiler guarantees.

You're looking at it from the perspective of a D programmer, but we also
have to look at it from the perspective of a compiler dev, who ideally
should be able to go by the spec.

If the guarantee/requirement is just that a pure function cannot
access mutable globals, then GC.malloc violates that at the moment. It
relies on undefined behavior. As far as I see, the compiler could skip
allocating space for globals when it sees that all entry points (main,
static ctors) are pure, because the whole program cannot possibly make
use of them anyway, right?

It's even normally simple to figure out whether a C function can be
marked as |pure|. Are you sure that it doesn't access any global or
static variables - either directly or via any of its function calls?
Yes? Then it can be |pure|. If not, then don't mark it as |pure|. It's
just when you try and insist that it be |pure| anyway that things get
complicated, because then you have to be sure that you understand the
guarantees that the compiler makes in order to be sure that marking
the function as |pure| won't violate those guarantees. And that sort
of thing should only be done by experts that understand the compiler
guarantees, and even then, it's /very/ rare that it's appropriate.

Since there is no specification of dmd's additional guarantees, relying
on compiler guarantees means relying on undefined behavior, here. Maybe
we can get away with that by coupling druntime to dmd, but I think that
would be a mistake. druntime should ideally be independent of the
compiler. That's still a goal, isn't it? And blatantly breaking the
rules in core code just doesn't taste right. It gives the wrong
impression that it's ok to break the rules, and it's probably going to
bite us later. Famous last words: "What could possibly go wrong?"

@jmdavis
Member

jmdavis commented Nov 2, 2016

If the guarantee/requirement is just that a pure function cannot access mutable globals, then GC.malloc violates that at the moment. It relies on undefined behavior. As far as I see, the compiler could skip allocating space for globals when it sees that all entry points (main, static ctors) are pure, because the whole program cannot possibly make use of them anyway, right?

It's clear from the function signature that optimizing away GC.malloc would be a problem. It returns void*, and it doesn't accept any pointers, so what it returns is either null, a pointer to a local variable (which is guaranteed to be broken, so there's no point in considering it), or it returns newly allocated memory. Yes, the compiler needs to understand that. And no, the spec is not precise enough about what kinds of optimizations are legal or expected. AFAIK, the spec doesn't even go into the fact that if the compiler can determine that the return value of a function has to be newly allocated, it can implicitly change the mutability of the return type.
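
The implicit conversion mentioned at the end of that paragraph looks like this in practice. `firstSquares` is an invented name; the sketch assumes only the documented behaviour that the result of a pure function whose parameters carry no mutable indirections may convert implicitly to immutable.

```d
// Takes no pointers or references, so whatever array it returns
// cannot alias anything the caller already holds.
int[] firstSquares(size_t n) pure nothrow @safe
{
    auto result = new int[](n);
    foreach (i, ref e; result)
        e = cast(int) (i * i);
    return result;
}

void main()
{
    // Used as an ordinary mutable array.
    int[] m = firstSquares(4);
    m[0] = -1;

    // The same function's result converts to immutable without a cast.
    immutable(int)[] frozen = firstSquares(4);

    assert(m == [-1, 1, 4, 9]);
    assert(frozen == [0, 1, 4, 9]);
}
```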

The spec isn't even vaguely precise enough to write a compiler from without looking at what dmd does, and yes, druntime is tied to the dmd front-end. Long term, we definitely want a spec that's precise enough to write a compiler from without looking at what dmd does, and Andrei was talking at dconf about looking at doing that, but it hasn't happened yet. Realistically, the definition of D is a combination of the online docs, what dmd actually does, and what Walter says. And no, that's not ideal, but that's reality at the moment. Work has been done to improve the online spec, and there are at least two projects working on implementing a D compiler based on the spec, which has led to further improvements of the spec, but until someone with the right skillset actually writes a formal spec that Walter agrees to, we don't really have one. We have online documentation that we call a spec, but it's way too informal to function as one, and it doesn't come even close to having the level of detail that would be needed for a true spec.

@aG0aep6G
Contributor

aG0aep6G commented Nov 2, 2016

On 11/02/2016 09:40 AM, Jonathan M Davis wrote:

The spec isn't even vaguely precise enough to write a compiler from
without looking at what dmd does, and yes, druntime is tied to the dmd
front-end. Long term, we definitely want a spec that's precise enough
to write a compiler from without looking at what dmd does, and Andrei
was talking at dconf about looking at doing that, but it hasn't
happened yet.

In addition to a better spec, we also want to decouple druntime from
dmd. A step towards those goals is a more precise definition of pure
that allows GC.malloc to be pure without breaking the rules.

@andralex
Member

andralex commented Nov 2, 2016

@nordlow generally the more and detailed docs the better. In a way the good info in this PR will be wasted if not immortalized in the form of documentation. @nordlow would be awesome if you acted as a curator.

@jmdavis is basically right: yes, a pure function that takes no mutable data and returns a mutable array must receive special treatment. Unless it uses undefined trickery, it must by necessity return freshly allocated memory. The compiler doesn't need to look at the implementation of such a function to figure that it returns fresh memory. That it also affects global state in the sense that there's less memory left for the program etc. should not be a problem for purity analysis (after all, the user starting another process may limit the memory available to the currently running D process at any time). @jmdavis is also right we currently don't have a clear enough specification, but it seems to me the intuition points the right way here. @aG0aep6G is also right - this is all by principle, no trick, no undefined crap that happens to work. As a matter of principle, a function that creates fresh memory is weakly pure.
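
A short sketch of why such a function cannot be treated like a memoizable one: two calls with equal arguments must still yield two distinct arrays, otherwise a write through one result would show up in the other. `zeros` is an invented name.

```d
// Pure, no mutable inputs, mutable result: each call has to produce
// fresh memory.
int[] zeros(size_t n) pure nothrow @safe
{
    return new int[](n);
}

void main()
{
    int[] a = zeros(3);
    int[] b = zeros(3);

    // Equal contents, but separate storage.
    assert(a == b);
    assert(a.ptr !is b.ptr);

    // Writing through one result must not affect the other.
    a[0] = 7;
    assert(b[0] == 0);
}
```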

@ibuclaw
Member

ibuclaw commented Nov 4, 2016

@ibuclaw we've gotten to a better understanding of the matters involved and fixed a bug in the compiler such that pure memory allocation functions are always considered weakly (not strongly) pure.

Do I assume right that you are looking for a way to reduce dead GC and stdlib memory allocations? If so, I agree that we should. However, I'm not sure about using pure to achieve this. Then again, I'm not really in favour of adding a new attribute either.

@jmdavis
Member

jmdavis commented Nov 5, 2016

@ibuclaw If we can do memory allocation in pure functions with the GC and not with malloc, then that's going to either seriously hamper pure in code that's trying to minimize GC-use, and/or it's going to reduce how often folks are willing to use malloc instead of new. And conceptually, the only real difference between new and malloc as far as allocation goes is that you have to explicitly free malloc-ed memory to avoid leaks, making it pretty weird that new can be pure but malloc can't - though needing to free stuff does throw a wrench into purity as well, and it's not necessarily clear that we can get away with making free pure. I really don't see much argument for not having malloc be pure when new is. The arguments against malloc being pure apply just as well to new, and new has been pure for quite a while now (maybe since the beginning of pure - I don't recall for sure at the moment).

@ibuclaw
Member

ibuclaw commented Nov 6, 2016

@jmdavis - I was thinking about it from the optimizer point of view. But for me it is easier to reason about a GC new operation as being pure. A variable that is unused apart from its initial GC assignment to me stands out as an opportunity to eliminate the call entirely. If memory returned from the GC isn't set or read, it will just be recycled anyway.

A typical example I used to see in gdc until fixed was when a function that creates a closure gets inlined, then const-folded away. However, apart from the return result, you're also left with a GC malloc call to initialize the now unused closure pointer. The latter was never discarded because the backend assumed the worst about any side effects such a call may have.

@nemanja-boric-sociomantic
Contributor

It's even normally simple to figure out whether a C function can be marked as pure. Are you sure that it doesn't access any global or static variables - either directly or via any of its function calls? Yes? Then it can be pure. If not, then don't mark it as pure.

Late to the party, but since malloc and friends can set errno to ENOMEM, why is this marked pure?
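
The errno effect can be observed with a short sketch like the one below. It assumes that a request of size_t.max bytes fails on the platform at hand; POSIX specifies ENOMEM for a failed malloc, while ISO C does not require errno to be set, so the check is deliberately loose.

```d
import core.stdc.errno : errno, ENOMEM;
import core.stdc.stdlib : free, malloc;

void main()
{
    errno = 0;

    // Assumption: a request this large cannot be satisfied.
    void* p = malloc(size_t.max);

    if (p is null)
    {
        // On POSIX systems errno is expected to be ENOMEM here; other
        // C libraries may leave it untouched.
        const observed = errno;
        assert(observed == 0 || observed == ENOMEM);
    }
    else
    {
        free(p);
    }
}
```

Where errno does change, a caller can tell that a failed malloc was invoked by reading a thread-local global afterwards, which is the observable side effect the question is about.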

@andralex
Member

@nemanja-boric-sociomantic See dlang/dlang.org#1528, which requires malloc to be called (no memoization). Then we need to figure what the impact of it setting a global is; I'm unclear on that right now.

@nordlow nordlow deleted the pure-stdlib-alloc branch October 2, 2018 07:00